Aim: The purpose of this study is to evaluate the association of the pre-internship Objective Structured 
Clinical Examination (OSCE) in final-year medical students with comprehensive written examinations. 
Subjects and material: All medical students of the October 2004 admission who took part in the October 2010 
National Comprehensive Pre-internship Examination (NCPE) and the pre-internship OSCE were included in the 
study (n = 130). OSCE and NCPE scores and medical grade point average (GPA) were collected. 
Results: GPA was highly correlated with NCPE (r = 0.76, P < 0.001) and moderately with OSCE (r = 0.68, 
P < 0.001). Similarly, a moderate correlation was observed between NCPE and OSCE scores (r = 0.6, 
P < 0.001). Linear stepwise regression shows that r² of a model applying GPA as the predictor of OSCE score 
is 0.46 (β = 0.68, P < 0.001), while addition of gender to the model increases r² to 0.59 (β = 0.61 and 0.36 
for GPA and male gender, respectively; P < 0.001). Logistic forward regression models show that male gender 
and GPA are the only independent predictors of a high score in the OSCE. The ORs of GPA and male gender for 
a high OSCE score are 4.89 (95% CI = 2.37-10.06) and 6.95 (95% CI = 2.00-24.21), respectively (P < 0.001). 
Discussion: Our findings indicate that the OSCE and examinations which mainly evaluate knowledge, judged by 
GPA and NCPE, are moderately to highly correlated. Our results illustrate the interwoven nature of knowledge 
and clinical skills. In other words, a certain level of knowledge is crucial for appropriate clinical 
performance. Our findings suggest that neither the OSCE nor written forms of assessment can replace each 
other. They are complementary and should also be combined with other evaluations to cover all attributes of clinical 
competence efficiently. 

Medical education is an essential investment that 
involves extensive training across many clinical 
disciplines and programs. These programs are 
aimed at producing competent graduates who are able to 
take an appropriate history, perform a comprehensive 
physical examination, solve problems, order and interpret 
essential paraclinical evaluations, arrive at a diagnosis, and 
outline a management plan. 



Considering the significance of this investment, it is of 
vital importance to evaluate how this investment is 
paying off. In this regard, the evaluation of the medical 
students at certain training points plays a prominent role 
for the students and their trainers (1, 2). 

Traditionally, multiple-choice questions, oral 
examinations, and tutor reports were frequently used in the 
evaluation of medical students. However, oral evaluations 
and ward assessments lack practical reliability and 
validity (3-6). Although multiple-choice questions have 
appropriate reliability, they are limited by the fact that 
they assess only one dimension of the students' 
competencies, namely clinical knowledge (5-7). Therefore, 
evaluation of the competencies and higher cognitive skills 
of undergraduate and graduate medical students by 
means of the OSCE has received more emphasis in recent 
years (8, 9). For example, the medical councils of 
Canada, Japan, and Korea employ the OSCE in their 
licensing examinations, the National Board of Medical 
Examiners incorporates the OSCE into the US 
Medical Licensing Step 3 Examinations (10, 11), and 
nearly all medical schools in the United States have reported 
using the OSCE in their regular evaluations (12). 

Medical education in Iranian Universities is divided 
into four periods: basic sciences (five semesters), intro- 
duction to clinical medicine (two semesters), clinical 
clerkships (five semesters), and internship (18 months). 
Before being allowed to continue their medical studies, 
students are supposed to pass two national comprehen- 
sive examinations after finishing their basic sciences 
period (national comprehensive basic sciences examina- 
tion) and clinical clerkships (national comprehensive pre- 
internship examinations). Both of these examinations are 
composed of multiple-choice questions and are con- 
ducted twice a year by the Ministry of Health and 
Medical Education. In 2009, Tehran University of 
Medical Sciences established pre-internship OSCE to 
evaluate the clinical skills of the medical students entering 
internship. 

Despite many studies on the value of the OSCE in 
graduate and undergraduate settings, findings regarding 
the relationship of the OSCE with other means of 
evaluation remain inconclusive. More specifically, it is not 
clearly known how knowledge-based evaluations and the 
OSCE rank students relative to each other. Similarly, it is 
not clear which kinds of stations can help the OSCE achieve 
better divergent validity. The aims of the current study are to: 

(1) Evaluate the reliability and validity of the pre- 
internship OSCE. 

(2) Assess the association between traditional written 
examinations and OSCE scores in an Iranian 
medical school. 

(3) Determine the stations that contribute most to 
the divergent validity of the OSCE. 

Subjects and materials 
Study population 

All medical students of the October 2004 admission who 
took part in the October 2010 National Comprehensive 
Pre-internship Examination (NCPE) and the pre-internship 
OSCE were included in this study (n = 130). 



Study measures 

OSCE settings 

The OSCE comprised 12 stations in four circuits, with 5 
min at each station. The total examination took 8 hours. 
Each station evaluated one or more aspects of clinical 
competence, including history taking, physical 
examination, communication skills, interpretation of 
laboratory findings, generating a differential diagnosis, and 
management. At nine stations, scoring was done by a single 
examiner using a prepared checklist (history 
taking, psychiatric interview, neurologic examination, 
ophthalmologic examination, management of preterm 
rupture of membranes [PROM], arterial blood gas 
sampling, stitching, adult cardiopulmonary resuscitation, 
and orthopedic procedure of splinting). Three stations 
(chest X-ray findings in mitral stenosis [MS], treatment 
and prognosis of Kerion, and evaluation of child growth 
curve) were unmanned; the students recorded their 
findings at these stations, and their recordings 
were subsequently evaluated by a single examiner. At five 
stations (history taking, psychiatric interview, neurologic 
examination, ophthalmologic examination, and orthopedic 
procedure of splinting) a standardized patient played a 
role. 

Communication skills and knowledge items in the 
history taking station included establishing rapport, 
acting respectfully, identifying the patient's concerns, 
involving the patient in the decision-making process, and 
planning for management. In the Kerion station, examinees 
were asked to describe the lesion, make a differential 
diagnosis, identify the most probable diagnosis, request 
appropriate laboratory tests, and provide appropriate 
treatment. In the chest X-ray findings station, examinees were 
asked about chest X-ray findings as well as the complications 
and treatment of MS. In the PROM station, students 
answered questions regarding the management of PROM. 
In the psychiatric interview, examinees were supposed to 
inquire about risk factors for suicide, including marital 
status, career, past medical and psychiatric history, and 
medications, in a standardized patient who had attempted 
suicide. In the ophthalmologic examination, examinees were 
supposed to assess visual acuity and the pupillary light 
reflex and to choose appropriate treatment for a 
standardized patient with eye trauma. In the other stations, 
the related skills were examined. 

Examiners were trained prior to the exam and 
were not involved in the design of the stations. 
Standardized patients were trained in three 45 min individual 
sessions and were given written instructions 
1 week prior to the examination. A description of 
each station was posted at the door of that station, 
and 1 min was assigned for reading it. 

On the day of the OSCE, the examinees were given 
a 30 min orientation. In this session, the structure of 
the examination was reviewed and an opportunity to 
ask questions was provided. 

Questionnaires to assess the content validity of the 
OSCE were distributed to the physician examiners (as lay 
experts), asking them to rate the stations they had 
observed with respect to the importance of each 
station, adequacy of time, appropriateness of the 
standardized patients' actions, and capacity of the checklists 
to assess the stations' objectives. 

National Comprehensive Pre-internship Examination 
(NCPE) scores and medical grade point average (GPA) 

NCPE scores and grade point averages (GPA) were 
kindly provided by the education deputy of the school 
of medicine. The highest attainable scores for NCPE and 
medical GPA were 200 and 100, respectively. 

Statistical analysis 

Data were analyzed with SPSS v13.0 for Windows (SPSS 
Inc., Chicago, IL, USA). Descriptive statistics, such as 
mean, median, count, range, and standard deviation (SD), 
were used to describe the characteristics, GPA, NCPE, 
and OSCE scores of the students. Reliability was 
calculated for the overall examination and each station 
by means of the internal consistency statistic Cronbach's 
alpha. Independent-samples t-tests and ANOVA were used to 
assess the differences among groups. Pearson's 
correlations were employed to assess the associations of 
the NCPE score, GPA, and other quantitative variables 
with the OSCE scores. Variables correlated with the OSCE 
score at P < 0.2 were selected for regression. NCPE 
scores, GPAs, gender, and age were employed as 
independent variables in linear and logistic regression models 
to predict the OSCE scores (dependent variable). For the 
logistic models, the students were grouped into high and 
low OSCE scores (n = 65 and 65, respectively), and further 
analysis was performed on these groups. A probability value 
of less than 0.05 was considered significant. 
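
As an illustration of the internal consistency statistic named above, the following is a minimal Python sketch of Cronbach's alpha for one checklist, assuming that every checklist item is scored numerically for every examinee. The function name `cronbach_alpha` and the toy scores are hypothetical; the study itself was analyzed in SPSS, and this sketch is not the authors' code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of per-examinee scores."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are needed")
    # Total checklist score of each examinee across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(item) for item in item_scores)
    total_var = variance(totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical toy data: 3 checklist items scored for 5 examinees.
toy_items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
alpha = cronbach_alpha(toy_items)
print(round(alpha, 3))
```

Sample variances are used for both the items and the totals; using population variances throughout would give the same alpha, because the scaling factor cancels in the ratio.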

Ethical considerations 

The study protocol was approved by TUMS Research 
Ethics Committee. 



Results 

Reliability and validity of OSCE 

Of the 130 students included in the study, 76 (58.46%) were 
female. The median age of the participants was 24 years 
(range, 23 to 26). Table 1 summarizes the internal 
consistency of the checklists at the different stations of the 
OSCE and the correlation between the score of each station 
and the total OSCE score excluding that station. 
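
The station-total correlation described here, in which the station being examined is left out of the total, can be sketched as follows. This is a minimal Python illustration under the assumption that each station yields one numeric score per examinee; the names `pearson_r` and `corrected_item_total` and the toy scores are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(station_scores, index):
    """Correlate one station with the total of all the other stations."""
    station = station_scores[index]
    # For each examinee: total score minus the score of the chosen station.
    rest = [sum(col) - col[index] for col in zip(*station_scores)]
    return pearson_r(station, rest)


# Hypothetical toy data: 3 stations scored for 4 examinees.
toy_stations = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],
    [4, 3, 2, 1],
]
r_first = corrected_item_total(toy_stations, 0)
r_third = corrected_item_total(toy_stations, 2)
```

Excluding the station from the total avoids inflating the correlation, since a station always correlates with a sum that contains it.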

The content validity of the OSCE is shown in Table 2. 
No significant difference was detected in the score of the 
students entering different circuits (P=0.08). 



Relationship of OSCE scores with comprehensive 
examinations 

The mean OSCE score achieved by the students was 56 
(range, 22.44-80.08) out of a total score of 120. Table 3 
compares the OSCE and NCPE scores, as well as the GPAs, of 
the students according to gender. As this table shows, there 
was no significant difference in the GPAs and NCPE scores 
of female and male students. However, male students 
performed better in the OSCE (Table 3). 
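
The group summaries for the total OSCE score reported in Table 3 (female: n = 76, 53.7 ± 9.8; male: n = 54, 59.3 ± 9.7) can be checked for internal consistency with a short Python sketch. It assumes an equal-variance (Student) independent-samples t-test, which the source does not specify, and the function names are hypothetical. The P value itself is not computed, because that would require a t distribution that the Python standard library does not provide.

```python
from math import sqrt


def pooled_mean(n1, mean1, n2, mean2):
    """Overall mean of two groups from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


def student_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance t statistic (group 2 minus group 1) from summary data."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / standard_error


# Total OSCE score by gender as reported in Table 3: (n, mean, SD).
female = (76, 53.7, 9.8)
male = (54, 59.3, 9.7)

overall = pooled_mean(female[0], female[1], male[0], male[1])
t_stat = student_t_from_summary(*female, *male)
print(round(overall, 1), round(t_stat, 2))
```

The pooled mean comes out at about 56, in line with the reported overall mean, and the t statistic is a little above 3 on 128 degrees of freedom, which is compatible with a P value of the reported order.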

GPA was highly correlated with NCPE (r = 0.76, 
P < 0.001) and moderately with OSCE (r = 0.68, 
P < 0.001). Similarly, a moderate correlation was observed 
between NCPE and OSCE (r = 0.6, P < 0.001) (data 
are not listed in the tables). Linear stepwise regression 
showed that r² of a model applying GPA as the predictor of 
the OSCE score was 0.46 (β = 0.68, P < 0.001), while 
the addition of gender to the model increased r² to 0.59 
(β = 0.61 and 0.36 for GPA and male gender, respectively; 
P < 0.001). The NCPE score was removed from both 
models due to high levels of collinearity. Logistic forward 
regression models showed that male gender and GPA were 
the only independent predictors of a high score in the OSCE. 
The ORs of GPA and male gender for a high OSCE score 
were 4.89 (95% CI = 2.37-10.06) and 6.95 (95% CI = 2.00-24.21), 
respectively (P < 0.001). 
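
Two arithmetic relations behind these figures can be illustrated with a short Python sketch. First, in a linear model with a single predictor the standardized coefficient equals the Pearson correlation, so its square gives the model r². Second, an odds ratio is the exponential of a logistic coefficient, and a Wald-type 95% CI is symmetric on the log-odds scale. The assumption that the reported intervals are Wald intervals is ours, and the function names are hypothetical; the source does not state how the intervals were obtained.

```python
from math import exp, log


def r_squared_single_predictor(beta):
    """r² of a one-predictor linear model from its standardized coefficient."""
    return beta ** 2


def log_odds_from_ci(lower, upper, z=1.96):
    """Recover a logistic coefficient and its SE from a Wald CI of the OR."""
    coef = (log(lower) + log(upper)) / 2
    se = (log(upper) - log(lower)) / (2 * z)
    return coef, se


def odds_ratio_with_ci(coef, se, z=1.96):
    """Odds ratio and Wald CI from a logistic coefficient and its SE."""
    return exp(coef), exp(coef - z * se), exp(coef + z * se)


# Standardized coefficient of GPA in the one-predictor model.
r2_gpa = r_squared_single_predictor(0.68)

# Reported 95% CIs for the ORs of GPA and male gender.
coef_gpa, se_gpa = log_odds_from_ci(2.37, 10.06)
or_gpa, low_gpa, high_gpa = odds_ratio_with_ci(coef_gpa, se_gpa)

coef_male, se_male = log_odds_from_ci(2.00, 24.21)
or_male, low_male, high_male = odds_ratio_with_ci(coef_male, se_male)

print(round(r2_gpa, 2), round(or_gpa, 2), round(or_male, 2))
```

Under these assumptions the squared coefficient rounds to the reported r² of 0.46, and the odds ratios recovered from the interval endpoints agree with the reported 4.89 and 6.95 to within rounding.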

Divergent validity of OSCE stations 

Table 4 summarizes the association of each station of the 
OSCE with the NCPE score and medical GPA. The 
highest correlation between NCPE score and OSCE 
stations was observed in treatment and prognosis of 
Kerion, management of PROM, and chest X-ray findings 
in MS stations (r=0.5, 0.38, and 0.35, respectively). 
Correspondingly, three stations of treatment and prog- 
nosis of Kerion, chest X-ray findings in MS, and 
management of PROM showed the highest correlation 
with GPA of the participants (r=0.46, 0.37, and 0.36, 
respectively). No significant correlation was evident 
between NCPE score and arterial blood gas sampling, 
evaluation of child growth curve, stitching, and adult 
cardiopulmonary resuscitation stations (Table 4). 

Discussion 

This study illustrates that the OSCE is a reliable and valid 
method for evaluation of medical students' clinical 
competencies. Previous studies have reported a wide 
range of reliability for the OSCE, from 0.12 to 0.89, indicating 
that the reliability of the OSCE is setting dependent (3, 8, 
9, 13-17). In our setting, the reliability of the OSCE was 
similar to the standard of 0.8, which is comparable to 
written examinations (18, 19). In agreement with our 
findings, it has been suggested that OSCEs with fewer than 10 
stations might be unable to incorporate all the material 
needed to cover, even superficially, reasonable measures 
of clinical competency, which reduces their validity 
and reliability (20-22). Our findings justify the use of 
the OSCE at particular points of medical education. 

Table 1. Description of the OSCE stations and their reliability (number of participants = 130) 

| Station description | Type of station | Cronbach's alpha | P value | r | P value |
|---|---|---|---|---|---|
| History taking | [illegible] | [illegible] | <0.001* | [illegible] | <0.001* |
| Psychiatric interview | [illegible] | [illegible] | <0.001* | [illegible] | <0.001* |
| Management of preterm rupture of membranes | P/PS | [illegible] | <0.001* | [illegible] | <0.001* |
| Neurologic examination | [illegible] | [illegible] | <0.001* | [illegible] | <0.001* |
| Ophthalmologic examination | P/PS | 0.83 | <0.001* | 0.31 | <0.001* |
| Chest x-ray findings in mitral stenosis | PS | 0.75 | <0.001* | 0.45 | <0.001* |
| Treatment and prognosis of Kerion | P/PS | 0.82 | <0.001* | 0.53 | <0.001* |
| Evaluation of child growth curve | PS | 0.75 | <0.001* | 0.34 | <0.001* |
| Arterial blood gas sampling | S | 0.82 | <0.001* | 0.26 | 0.002* |
| Stitching | S | 0.80 | <0.001* | 0.52 | <0.001* |
| Adult cardiopulmonary resuscitation | S | 0.88 | <0.001* | 0.52 | <0.001* |
| Orthopedic procedure of splinting | S | 0.85 | <0.001* | 0.57 | <0.001* |

H, History taking; C, Communication skills; P, Physical examination; PS, Problem solving; S, Skill; r, item-total test score correlation. 
*Significant. 
[illegible]: entry could not be recovered from the source text. 

Our findings reveal that the OSCE and examinations 
which mainly evaluate knowledge, judged by GPA and 
NCPE, are moderately to highly correlated. This illustrates 
the interwoven nature of knowledge and clinical skills. In 
other words, a certain level of knowledge is crucial for 
appropriate clinical performance. In line with our 
findings, Simon et al concluded that second-year 
medical students' OSCE scores were moderately 
correlated with USMLE Step 2 scores (23). Similarly, 
Muller et al demonstrated a moderate correlation 
between clinical skills and USMLE Step 2 (24). A similar 
association was reported between the OSCE and written 
examinations of dental students or residents (9, 25, 26). On 
the other hand, the high correlations observed in the 
aforementioned studies indicate the low divergent validity of 
the OSCE. In other words, clinical skills should be weighted 
more heavily in the OSCE to increase the divergent validity of 
this exam, so that the OSCE can provide information 
which is not captured by knowledge-based written forms 
of examination. In concordance with this finding, a 
detailed evaluation of the OSCE stations showed that 
stations in which knowledge is more emphasized, such as 
treatment and prognosis of Kerion, management of 
PROM, and chest X-ray findings in MS, had a higher 
correlation with medical GPA and NCPE scores. In 
contrast, the arterial blood gas sampling, stitching, adult 
cardiopulmonary resuscitation, and evaluation of child 
growth curve stations, which focused more on clinical 
skills, were the main sources of the divergent validity of 
the OSCE. Notably, in this study, 5 min was allotted for 
the students to accomplish their tasks at each station. 
This short time may also reduce the divergent 
validity of the OSCE. Prolonging the duration of the stations 
to 15 or 20 min could give the examinees 
the opportunity to demonstrate their clinical skills 
more fully, thereby improving the divergent 
validity of the OSCE; however, there is some evidence 
that this may not significantly change the scores of 
the students in knowledge-based stations (27). 

Our findings show that although no significant 
difference was observed in the knowledge of male and 
female students, male students performed much 
better in the OSCE. This is in contrast to the findings of 
some studies that implied female students tend to 
perform better in clinical examinations (23, 28-30). 
Although most of these studies were conducted on 
written examinations, some studies have pointed out the 
superior performance of female students in clinical skills 
as well (31). In contrast, other studies have indicated no 
or minimal gender difference in the scores of students 
(and only in certain stations of the OSCE), which may not 
seriously influence the performance of the students in 
reality (32, 33). We can postulate that the better performance 
of male students in practical settings observed in 
this study may be attributed to lower levels of anxiety or 
higher levels of self-confidence among male students, or 
possibly to fewer social interactions or weaker 
communication skills among female students 
in our culture; however, further studies are needed to 
support these speculative explanations. 

Table 2. Content validity of the OSCE (n = 31) 

| Question | Strongly agree % | Agree % | Neutral % | Disagree % | Strongly disagree % |
|---|---|---|---|---|---|
| Checklist items were precise and clear | 16.1 | 64.5 | 16.1 | 3.2 | 0 |
| Checklist items were in accordance with station objectives | 20 | 63.3 | 13.3 | 0 | 3.3 |
| Station objectives were prevalent in daily practice | 48.4 | 35.5 | 12.9 | 3.2 | 0 |
| Station objectives were imperative in daily practice | 48.4 | 45.2 | 6.5 | 0 | 0 |
| Time of the station was adequate | 19.4 | 51.6 | 6.5 | 22.6 | 0 |
| Standardized patients acted properly | 25 | 55 | 20 | 0 | 0 |

Table 3. OSCE, NCPE scores, and GPAs of the students (n = 130) [a] 

| OSCE stations | Total | Female (n = 76) | Male (n = 54) | P value |
|---|---|---|---|---|
| History taking | [illegible] | 5.9 ± 1.3 | 5.5 ± 1.3 | [illegible] |
| Psychiatric interview | [illegible] | 6.1 ± 1 | 5.9 ± 0.9 | [illegible] |
| Management of preterm rupture of membranes | [illegible] | 4.1 ± 2 | 5.7 ± 1.5 | <0.001* |
| Neurologic examination | [illegible] | 5 ± 1.7 | 5.7 ± 1.6 | [illegible] |
| Ophthalmologic examination | [illegible] | 1.7 ± 2.1 | 2 ± 2.1 | [illegible] |
| Chest x-ray findings of mitral stenosis | [illegible] | 1.9 ± 1.4 | 3 ± 1.7 | <0.001* |
| Treatment and prognosis of Kerion | 3.4 ± 2 | 2.7 ± 1.9 | 4.5 ± 1.9 | <0.001* |
| Evaluation of child growth curve | 3.7 ± 1.7 | 3.9 ± 1.7 | 3.4 ± 1.9 | 0.18 |
| Arterial blood gas sampling | 6.6 ± 1.5 | 5.8 ± 1.2 | 7.3 ± 1.6 | <0.001* |
| Stitching | 6.5 ± 2.1 | 6.5 ± 2.1 | 6.3 ± 2.1 | 0.46 |
| Adult cardiopulmonary resuscitation | 6.8 ± 1.7 | 6.8 ± 1.9 | 6.9 ± 1.6 | 0.8 |
| Orthopedic procedure of splinting | 2.4 ± 2.5 | 1.3 ± 1.7 | 4 ± 2.5 | <0.001* |
| Total OSCE score | 56 ± 10.1 | 53.7 ± 9.8 | 59.3 ± 9.7 | 0.002* |
| NCPE | 129.76 ± 21.52 | 127.72 ± 19.76 | 132.55 ± 23.62 | 0.21 |
| GPA | 82.3 ± 5.4 | 81.6 ± 5.2 | 83.55 ± 5.65 | 0.11 |

[a] Data presented as mean ± standard deviation. 
*Significant. 
[illegible]: entry could not be recovered from the source text. 

An important benefit of the OSCE is that it provides 
students and faculty members with feedback. Reviewing 
the group performance of the students in this study 
revealed that the students performed poorly in orthopedic 
procedure skills, ophthalmologic examination, and chest 
X-ray findings in MS. The significance of this finding is 
further enhanced by evidence suggesting that faculty 
expectations might not correspond to the actual 
competencies of the students (21); such feedback can help faculty 
members develop a more competency-based curriculum. 
Despite the promising possibilities of the OSCE, 
one should also take into account that the OSCE is 
labor-intensive. It requires faculty time, standardized patient 
recruitment and training, administrative costs, quality 
control, security control, etc. 

Table 4. Association of each OSCE station with medical GPA and NCPE score (n = 130) 

| OSCE station | NCPE: r | NCPE: P value | GPA: r | GPA: P value |
|---|---|---|---|---|
| History taking | 0.29 | 0.001* | 0.28 | 0.008* |
| Psychiatric interview | 0.23 | 0.008* | 0.10 | 0.34 |
| Management of preterm rupture of membranes | 0.38 | <0.001* | 0.36 | <0.001* |
| Neurologic examination | 0.20 | 0.02* | 0.16 | 0.12 |
| Ophthalmologic examination | 0.27 | 0.002* | 0.21 | 0.04* |
| Chest x-ray findings of mitral stenosis | 0.35 | <0.001* | 0.37 | <0.001* |
| Treatment and prognosis of Kerion | 0.50 | <0.001* | 0.46 | <0.001* |
| Evaluation of child growth curve | 0.14 | 0.11 | 0.29 | 0.006* |
| Arterial blood gas sampling | 0.08 | 0.33 | -0.02 | 0.83 |
| Stitching | 0.16 | 0.06 | 0.33 | 0.002* |
| Adult cardiopulmonary resuscitation | 0.21 | 0.013 | 0.24 | 0.02* |
| Orthopedic procedure of splinting | 0.33 | <0.001* | 0.43 | <0.001* |

*Significant. 

There were several limitations in the current study. First, 
the impact of standardized patients and examiners on 
students' performance was not evaluated. 
Second, our study is limited to a single medical 
school; therefore, our findings may not be fully 
generalizable. Moreover, TUMS is ranked as the best medical 
school in Iran, so it usually admits highly motivated and 
talented students with limited variation in their 
competencies. This suggests 
that generalization of our results should be done with 
additional caution. Third, in this OSCE, standardized patients as 
well as various standard media and medical models were 
utilized, so our findings may not be applicable to 
OSCEs that mainly use standardized patients. Fourth, 
although we assessed the academic performance of the 
students, it is widely acknowledged that a strong 
academic background does not necessarily guarantee a 
successful professional career. Furthermore, in 
interpreting our results, it is important to consider that the OSCE has 
limited ability to measure the real performance of 
students in authentic situations (34, 35). Future studies 
should evaluate the role of the OSCE and other 
examinations in predicting the clinical performance of graduated 
doctors. 

Our findings suggest that neither the OSCE nor the written 
forms of assessment can replace each other. These 
examinations are complementary, and they should be 
combined with other methods of evaluation to cover all 
attributes of clinical competence efficiently. 